Segmental Spatiotemporal CNNs for Fine-Grained Action Segmentation

Authors

  • Colin Lea
  • Austin Reiter
  • René Vidal
  • Gregory D. Hager
Abstract

  • Generalizes our temporal convolutions
  • Unifies our unary and temporal models
  • New results on GTEA, 50 Salads, and JIGSAWS

1) Improved Dense Trajectories (IDT) + Bag of Words do not work well on FGAR
2) Spatial and spatiotemporal CNNs have been proposed, but IDT (or CNN+IDT) is typically superior [Sun ICCV15, Heilbron CVPR15, Simonyan ICLR15, Jain CVPR15, Tran ICCV15, Karpathy CVPR14, ...]
3) Current action localization results are poor, implying that these models capture scene information but not the essence of what defines an action

Similar resources

Segmental Spatio-Temporal CNNs for Fine-grained Action Segmentation and Classification

Joint segmentation and classification of fine-grained actions is important for applications in human-robot interaction, video surveillance, and human skill evaluation. However, despite substantial recent progress in large scale action classification, the performance of state-of-the-art fine-grained action recognition approaches remains low. In this paper, we propose a new spatio-temporal CNN mod...

Face Parsing via Recurrent Propagation

Face parsing is an important problem in computer vision that finds numerous applications including recognition and editing. Recently, deep convolutional neural networks (CNNs) have been applied to image parsing and segmentation with the state-of-the-art performance. In this paper, we propose a face parsing algorithm that combines hierarchical representations learned by a CNN, and accurate label...

Efficient Hardware Realization of Convolutional Neural Networks using Intra-Kernel Regular Pruning

The recent trend toward increasingly deep convolutional neural networks (CNNs) leads to a higher demand of computational power and memory storage. Consequently, the deployment of CNNs in hardware has become more challenging. In this paper, we propose an Intra-Kernel Regular (IKR) pruning scheme to reduce the size and computational complexity of the CNNs by removing redundant weights at a fine-g...

A Closer Look at Spatiotemporal Convolutions for Action Recognition

In this paper we discuss several forms of spatiotemporal convolutions for video analysis and study their effects on action recognition. Our motivation stems from the observation that 2D CNNs applied to individual frames of the video have remained solid performers in action recognition. In this work we empirically demonstrate the accuracy advantages of 3D CNNs over 2D CNNs within the framework o...

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing that automatically analyzes the dependency structure of natural-language sentences, producing a dependency graph for each input sentence. Part-Of-Speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers do the POS tagging task along with dependency parsing in a pipeline mode. Unfortunately, in pipel...

Publication date: 2016